Added support to rotate in fp32 (optional) #885
Conversation
📝 Walkthrough

This change introduces granular rotation configuration for the quantization preprocessing pipeline. The `rotate` parameter now accepts either a boolean or a dictionary with `"enable"` and `"rotate_fp32"` keys, enabling independent control of rotation functionality and floating-point precision casting during Hadamard transforms.
Codecov Report

❌ Patch coverage is
Additional details and impacted files

@@ Coverage Diff @@
## main #885 +/- ##
==========================================
- Coverage 73.11% 73.09% -0.02%
==========================================
Files 205 205
Lines 22281 22292 +11
==========================================
+ Hits 16291 16295 +4
- Misses 5990 5997 +7

☔ View full report in Codecov by Sentry.
Force-pushed 57f1766 to 044a4ab
modelopt-bot
left a comment
Code Review: PR #885 - Added support to rotate in fp32 (optional)
Overview
This PR adds an optional FP32 precision mode for the Hadamard rotation transform used in RHT (Randomized Hadamard Transform) quantization methods like QuaRot/SpinQuant. The change is backward compatible.
✅ Strengths
| Aspect | Assessment |
|---|---|
| Backward Compatibility | Excellent. The rotate config now accepts bool | dict[str, bool] — existing boolean configs continue to work unchanged. |
| Code Organization | Clean property abstraction in tensor_quantizer.py — rotate_is_enabled and rotate_is_fp32 handle the type polymorphism well. |
| Test Coverage | test_kv_rotate is now parametrized for both rotate=True and rotate={"enable": True, "rotate_fp32": True}. Also added FP32 test in test_hadamard_transform. |
| Documentation | Changelog updated, config docstrings are clear, and the PR description includes usage examples. |
| Typo Fix | Caught and fixed "ratation" → "rotation" in the config description. |
⚠️ Points to Consider
1. Numerical Precision in Mixed Scenarios
In functional.py, the sqrt calculation uses torch.float32 tensor, but if rotate_fp32=False and input is FP16/BF16, the division happens in the input's dtype. This is existing behavior, but worth noting that the denominator's precision differs based on mode.
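To illustrate the dtype behavior being described, here is a minimal sketch (function name and structure are hypothetical, not the actual `functional.py` code): the sqrt denominator starts in float32 either way, but the division runs in float32 only when `rotate_fp32` is set.

```python
import torch


def hadamard_normalize(x: torch.Tensor, rotate_fp32: bool) -> torch.Tensor:
    """Illustrative sketch of the 1/sqrt(n) normalization after a Hadamard
    transform, showing how the division dtype depends on the mode."""
    n = x.shape[-1]
    denom = torch.sqrt(torch.tensor(n, dtype=torch.float32))
    if rotate_fp32:
        # Compute (and divide) in float32, then cast back to the input dtype.
        return (x.float() / denom).to(x.dtype)
    # Divide in the input's dtype (fp16/bf16), matching the existing behavior.
    return x / denom.to(x.dtype)
```

Both branches return the input dtype; the difference is only the precision of the intermediate arithmetic.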
2. Property Logic in rotate_is_fp32
```python
return (
    self._rotate.get("rotate_fp32", False)
    if isinstance(self._rotate, dict) and self.rotate_is_enabled
    else False
)
```

The check `self.rotate_is_enabled` inside `rotate_is_fp32` is redundant since you already check `isinstance(self._rotate, dict)`. If it's a dict with `"enable": False`, returning False either way is fine, but consider simplifying to just check `isinstance`.
3. Test Assertions
Both FP32 and non-FP32 tests use the same atol=0.05. This is reasonable given the comment about 16-bit float errors, but consider if tighter tolerance is expected for FP32 mode.
📝 Minor Nits
- The `CHANGELOG.rst` entry is brief. Consider slightly more detail: "Add support for FP32 rotation computation in RHT quantization (QuaRot/SpinQuant) when specified in rotation config."
Verdict
LGTM — This is a solid, well-tested feature addition with proper backward compatibility. The code is clean and the documentation is thorough. Ready to merge after addressing any optional nits.
modelopt-bot
left a comment
See detailed feedback in inline comments below.
Signed-off-by: Kinjal Patel <kinjalpravin@nvidia.com>
Force-pushed 9920a1e to 126570f
What does this PR do?
Type of change: New Feature
Overview:
This PR adds support for performing the RHT rotation in float32 when enabled by the quantization configuration. It also makes the `rotate` argument in the quantization configuration accept either a `bool` (for backward compatibility) or a `dict` (which adds the float32 rotation option).
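For illustration, the two accepted shapes of the `rotate` field might look like the following (the dicts below are simplified fragments, not the full quantization config schema):

```python
# Backward-compatible boolean form: rotation on, default precision.
quant_cfg_bool = {"rotate": True}

# New dict form: enable rotation and run the Hadamard transform in float32.
quant_cfg_fp32 = {"rotate": {"enable": True, "rotate_fp32": True}}

# Dict form with fp32 off behaves like the boolean form.
quant_cfg_fp16 = {"rotate": {"enable": True, "rotate_fp32": False}}
```

Existing configs that pass a plain boolean need no changes.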
Usage

- Updated `NVFP4_KV_ROTATE_CFG` locally with `"rotate": {"enable": True, "rotate_fp32": True}`
- Updated `NVFP4_KV_ROTATE_CFG` locally with `"rotate": {"enable": True, "rotate_fp32": False}`

Testing

Updated unit test in `tests/gpu/torch/quantization/test_hadamard.py`

Before your PR is "Ready for review"
Additional Information
Tests